Closes #71 rounding #72

kaz462 · 2023-07-25T04:32:18Z

No description provided.

manciniedoardo

Great article @kaz462 - I learned a lot about different ways to round 😄 I left some review comments.

manciniedoardo · 2023-07-27T12:52:34Z

posts/2023-07-24_rounding/rounding.qmd

+  repo_spec = "pharmaverse/blog",
+  name = long_slug
+)
+```


This is a minor thing, and perhaps you don't want to do this, but I thought I would mention it anyway: would it be worth adding a section looking at the bigger picture, and how far we should be trying to go to match SAS rounding? I'm thinking about the fact that if we are trying to move to R-based submissions, then some of this stuff should in a way be expected and accepted. My idea comes from this comment by @rossfarrugia.

When moving from SAS to R submissions, what kinds of discrepancy are expected/accepted for rounding?

To add context to my earlier comment: when using the same language for double-programming QC, teams would often strive for a 100% perfect comparison match, whereas in a multilingual world it may be OK not to match perfectly, as long as audit-ready QC evidence explains that the differences come down to the use of different languages. E.g. a % in first-line output with R is 1% and in QC with SAS it is 2%, and upon inspection the QC'er sees that the actual value is 1.5% rounded, hence they just add a comment in the QC evidence to explain that such differences are considered acceptable.

E.g. a % in first-line output with R is 1% and in QC with SAS it is 2%, and upon inspection the QC'er sees that the actual value is 1.5% rounded, hence they just add a comment in the QC evidence to explain that such differences are considered acceptable.

This discrepancy is not caused by SAS/R, but by different rounding approaches, which I think should be based on the study SAP instead of the software. Should the rounding method be clarified first, then choose the corresponding functions in SAS/R?

I doubt most SAPs would go into such detail, as nobody really questioned rounding approaches when we only used a single language. It's a fair point though. I feel like a closing note such as the following could work: "Note: with the differences in default behaviour across languages, you could consider your QC strategy and whether an acceptable level of fuzz in the electronic comparisons could be allowed for cases such as rounding when comparing two programs written in different languages, as long as this is documented. Alternatively, you could document the exact rounding approach to be used in the SAP and then match this regardless of the programming language used."
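
One way to picture the "acceptable level of fuzz" idea in that note is a QC check that separates mismatches caused by a rounding tie from mismatches that still need investigation. The following is a minimal sketch in base R, not part of the post: the helper name `is_rounding_tie`, its tolerance and the input values are all made up for illustration, and `floor(x + 0.5)` only stands in for a round half up result on non-negative values.

```r
# Hypothetical helper (not from any package): TRUE when the unrounded value
# sits on a .5 tie at the requested number of decimal places.
is_rounding_tie <- function(raw, digits = 0, tol = sqrt(.Machine$double.eps)) {
  scaled <- abs(raw) * 10^digits
  abs((scaled - floor(scaled)) - 0.5) < tol
}

raw <- c(0.5, 2.5, 3.2, 7.8)  # made-up unrounded percentages
half_even <- round(raw)       # base R default (round half to even)
half_up <- floor(raw + 0.5)   # stand-in for round half up (non-negative input)

mismatch <- half_even != half_up              # results that differ between methods
explained <- mismatch & is_rounding_tie(raw)  # differences explained by a tie
unexplained <- mismatch & !explained          # differences needing a closer look

print(data.frame(raw, half_even, half_up, mismatch, explained))
```

Only the two tie values differ between the methods, and both are flagged as explained; anything left in `unexplained` would point to a cause other than the rounding method.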

Good summary, I will add it next week after the holiday.
The reason I prefer to align on the rounding method during QC is that when we accept the difference, it's easy to overlook differences caused by other reasons, e.g. RConsortium/submissions-pilot3-adam#99

Thanks @rossfarrugia @clarkliming for your comments and discussion!

bms63

omg this rounding numbers graphic is amazing!!

bms63 · 2023-07-30T19:19:42Z

Can we expand the description here? It feels kind of empty.

bms63 · 2023-07-30T19:21:16Z

I think we might want to mention SAS's rounde function, which matches R's round function

bms63 · 2023-07-30T19:23:33Z

It might be nice to give this PHUSE working group more of a shout out! A cross-industry initiative to document differences between programming languages.

bms63 · 2023-07-30T19:27:10Z

So I first encountered this issue when double programming in R for TLGs made in SAS. I think you experienced the same issue and I think that should be mentioned here to set the stage.

bms63 · 2023-07-30T19:28:37Z

Maybe change this to Exploring Rounding Options. Bug Fix makes it seem like R is doing something wrong and SAS is doing it right.

bms63 · 2023-07-30T19:32:17Z

The post just sort of ends abruptly. Is there any way we could tie it up with your current experiences or a summary of the post?

This one is going to be awesome!!!

kaz462 · 2023-08-01T02:37:02Z

Maybe change this to Exploring Rounding Options. Bug Fix makes it seem like R is doing something wrong and SAS is doing it right.

What do you think of changing "Bug Fix" to "Numerical precision issue"?
I also linked the SAS documentation on this topic: Numerical Accuracy in SAS Software

kaz462 · 2023-08-01T02:52:18Z

@manciniedoardo @bms63 Thanks so much for your review and comments!

I have limited knowledge of computational precision and floating point, so I didn't go too deep into this topic and referenced multiple resources instead. Let me know if everything makes sense.

bms63

I think option 2 for the table would be helpful, but I'm okay if it doesn't happen.

The discussion on clean compares between two languages is interesting. Maybe a follow-up post?

Perhaps leave them with a question at the end of the post. How concerned should we be if programs in R and SAS are off ...

kaz462 · 2023-08-02T19:56:38Z

Perhaps leave them with a question at the end of the post. How concerned should we be if programs in R and SAS are off ...

Good idea on the open-ended question. The following example illustrates that the "safe" options I mentioned in the last section don't always work as expected, but I'm not sure if this will ever happen in our daily work.

In this example, a is a value slightly less than 1.5. So if we choose the round half up approach with 0 decimal places, an output of 1 is expected, but because + sqrt(.Machine$double.eps) is used, we get 2 as the result.

> a <- 1.5 - 0.5*sqrt(.Machine$double.eps)
> janitor::round_half_up(a, digits = 0)
[1] 2

If it happens, maybe we could argue that a and 1.5 are nearly equal by using the all.equal() function from base R with its default tolerance value (default tolerance = sqrt(.Machine$double.eps)).

> all.equal(a, 1.5)
[1] TRUE

@manciniedoardo @rossfarrugia @bms63 Depending on the situation, maybe discrepancies like this example can be accepted?
Should I add this example to the last section?
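
The effect described above can be reproduced without any package. The following is a rough sketch in base R of a round half up function with an adjustable tolerance term; `half_up_sketch` and its `fuzz` argument are hypothetical names, and the body is only modelled on the "+ sqrt(.Machine$double.eps)" adjustment mentioned in this comment, so it should not be read as the exact code of janitor::round_half_up or ut_round1.

```r
# Sketch of a round-half-up function with an adjustable fuzz term.
# `half_up_sketch` and `fuzz` are hypothetical names used for illustration.
half_up_sketch <- function(x, digits = 0, fuzz = sqrt(.Machine$double.eps)) {
  scaled <- abs(x) * 10^digits
  # Add 0.5 plus the fuzz, truncate, then restore scale and sign.
  sign(x) * trunc(scaled + 0.5 + fuzz) / 10^digits
}

a <- 1.5 - 0.5 * sqrt(.Machine$double.eps)  # slightly below the tie at 1.5

with_fuzz <- half_up_sketch(a)               # the fuzz lifts a over the tie
without_fuzz <- half_up_sketch(a, fuzz = 0)  # plain half up keeps a below it
near_tie <- isTRUE(all.equal(a, 1.5))        # a and 1.5 are "nearly equal"

print(c(with_fuzz = with_fuzz, without_fuzz = without_fuzz))
```

With the default fuzz the result is 2, and with `fuzz = 0` it is 1, which is the trade-off under discussion: the tolerance that absorbs representation error near a tie also moves values that are genuinely just below it.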

clarkliming · 2023-08-03T01:00:06Z

posts/2023-07-24_rounding/rounding.qmd

+After the fix:
+
+```{r, message = FALSE}
+# revised rounds half up


Personally, I don't think we should do this. It will change a lot of behaviors, e.g. ut_round1(2436.84499999999999, 2) gives 2436.85 while round(2436.84499999999999, 2) gives 2436.84 (which is correct).

@clarkliming Thanks for the review! Your example is a similar case to the one mentioned above: a <- 1.5 - 0.5*sqrt(.Machine$double.eps)

The reason for this behavior is that 2436.845 - 2436.84499999999999 is less than sqrt(.Machine$double.eps), so the adjustment in the function is enough to push the value over the tie and round it up. When we encounter this situation, maybe we could handle it with one of the following options:

accept this value and explain: 2436.845 and 2436.84499999999999 are nearly equal according to the all.equal() function from base R with its default tolerance value (default tolerance = sqrt(.Machine$double.eps))

the output from SAS round(2436.84499999999999, 0.01) is also 2436.85

choose a smaller value than sqrt(.Machine$double.eps) or remove it to adjust this function

How would you do "round half up" in R if we don't choose those functions (ut_round1, janitor::round_half_up, ...)?

I assume this function will be used deep inside other functions and at large scale, so controlling the tolerance manually is unlikely (which means that users can only apply one tolerance to all the numeric results obtained).

I will split your question into two: one is how do we deal with round half up, the other is how do we deal with rounding error.

My response to these two questions is the same: no action is needed.

For question 1, R is following IEEE 754, with default bankers rounding. What SAS provided seems reasonable, however, R is not wrong. This is just a different method. R does not need to copy SAS.

For question 2, about the accuracy loss in computation, I will also do nothing. This is an issue by design, and not possible to perfectly solve at the moment.

Both issues won't affect the interpretation of the result

So for me, introducing the fact that R and SAS are different in rounding should be sufficient.
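
The default behaviour referred to under question 1 can be checked directly on ties that are exactly representable in binary floating point. This is a small illustrative sketch in base R; the comparison column is a plain round half up written inline for this example, not a function from the post or from any package.

```r
# Exactly representable ties: base R rounds half to even.
ties <- c(0.5, 1.5, 2.5, 3.5)
half_even <- round(ties)

# A plain round half up for comparison (no tolerance term).
half_up <- sign(ties) * floor(abs(ties) + 0.5)

print(data.frame(value = ties, half_even = half_even, half_up = half_up))
```

The two methods agree on 1.5 and 3.5 and differ on 0.5 and 2.5. Values such as 2436.845 are not exactly representable in binary floating point, so at more decimal places the result also depends on the stored value, which is the accuracy question raised under question 2.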

Coming back to the issue: the function ut_round1 tries to solve the rounding error arising from the design, but it does not. So it is important to also mention this point. (Also, this issue cannot be solved in the current design; this fix only addresses one part of it.)

I'm not trying to claim that base R's round function is wrong or that we should strictly mimic SAS's behavior, but to provide round half up options in R. This is similar to the intention behind the rounding option in Tplyr: offering users more flexibility, since there may be contexts where 'round half up' is required.

What do you think of using the following examples to highlight that those round_half_up functions do not offer the same level of precision and accuracy as the base R round function?
Please feel free to update this post directly, or let me know how to clarify this point :)

> a <- 1.5 - 0.5*sqrt(.Machine$double.eps)
> b <- 1.5 - 0.5*.Machine$double.eps
> janitor::round_half_up(a)
[1] 2
> janitor::round_half_up(b)
[1] 2
> round(a)
[1] 1
# base round reaches the precision limit:
> round(b)
[1] 2

bms63

This is all good for me! @rossfarrugia @StefanThoma any last concerns or can we merge in and publish?

rossfarrugia · 2023-08-22T13:52:12Z

Fine from my side 👍

kaz462 added 2 commits July 24, 2023 21:27

#71: rounding draft

8a1f540

Merge branch 'main' into 71_rounding

dba57ac

kaz462 requested review from bms63, StefanThoma and manciniedoardo July 25, 2023 04:32

manciniedoardo requested changes Jul 27, 2023

bms63 reviewed Jul 30, 2023

#71: address comments

b129ca3

bms63 self-requested a review August 2, 2023 17:57

bms63 approved these changes Aug 2, 2023

kaz462 added 2 commits August 2, 2023 11:58

#71: split the summary table

448d198

#71: update .lycheeignore

c973204

clarkliming reviewed Aug 3, 2023

kaz462 added 5 commits August 21, 2023 18:16

#71: update the last section

baae1b9

Merge branch 'main' into 71_rounding

9dd2936

# Conflicts: # inst/WORDLIST.txt

#71: update wordlist

3abb047

#71: styler/link

53a5992

#71: update wordlist

1a031b6

kaz462 requested review from clarkliming, rossfarrugia and manciniedoardo August 22, 2023 01:32

kaz462 requested a review from bms63 August 22, 2023 01:32

bms63 approved these changes Aug 22, 2023

bms63 merged commit c0a3c22 into main Aug 22, 2023
4 checks passed

bms63 deleted the 71_rounding branch August 22, 2023 13:55
